Abstract



In this paper we derive higher order convergence rates in terms of
the Bregman distance for Tikhonov like convex regularisation for linear
operator equations on Banach spaces. The approach is based on the idea
of variational inequalities, which are, however, not imposed on the original 
Tikhonov functional, but rather on a dual functional. Because of that, 
the approach is not limited to convergence rates of lower order, but yields 
the same range of rates that is well known for quadratic regularisation on 
Hilbert spaces. 

1 Introduction

Classical results in the theory of quadratic Tikhonov regularisation on Hilbert
spaces show that the asymptotic quality of the regularised solution of an
ill-posed linear operator equation $Ax = y$ depends strongly on the smoothness of
the true solution $x^\dagger$ with respect to the operator $A\colon X \to Y$: If
$x^\dagger$ satisfies the range condition

$$x^\dagger \in \operatorname{Ran}(A^*A)^\nu \tag{1}$$

for some $0 < \nu \le 1$ and the accuracy of the right hand side of the equation
is of order $\delta$, then one obtains, with a suitable choice of the regularisation
parameter, an accuracy of the regularised solution of order
$\delta^{2\nu/(2\nu+1)}$. More precisely, denote by $y^\delta \in Y$ any right hand
side satisfying $\|y^\delta - y^\dagger\| \le \delta$, and by $x_\alpha^\delta \in X$
with $\alpha > 0$ and $\delta > 0$ any element satisfying

$$x_\alpha^\delta = \operatorname{arg\,min}\{\|Ax - y^\delta\|^2 + \alpha\|x\|^2 : x \in X\}$$

for some $y^\delta$. Then a choice of the regularisation parameter $\alpha$ of order
$\alpha \sim \delta^{2/(2\nu+1)}$ implies that
$\|x_\alpha^\delta - x^\dagger\| = O(\delta^{2\nu/(2\nu+1)})$ (see for instance [7] or [12]).
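
The classical rate can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical diagonal operator with singular values $\sigma_i = 1/i$ on $\mathbb{R}^n$, a true solution satisfying the range condition (1) with $\nu = 1$, and a deterministic perturbation of norm exactly $\delta$. All function names are illustrative.

```python
import math

def tikhonov_diagonal(sigma, y, alpha):
    """Minimise ||A x - y||^2 + alpha ||x||^2 for A = diag(sigma).

    The normal equations (A*A + alpha I) x = A* y decouple, so each
    component equals sigma_i * y_i / (sigma_i^2 + alpha).
    """
    return [s * yi / (s * s + alpha) for s, yi in zip(sigma, y)]

def a_priori_alpha(delta, nu):
    """Parameter choice alpha ~ delta^(2/(2 nu + 1)); the constant is set to 1."""
    return delta ** (2.0 / (2.0 * nu + 1.0))

# Hypothetical toy problem (an assumption of this sketch):
# sigma_i = 1/i and x_dagger = (A*A)^nu w with nu = 1.
n, nu = 200, 1.0
sigma = [1.0 / i for i in range(1, n + 1)]
w = [1.0 / i for i in range(1, n + 1)]
x_dagger = [s ** (2 * nu) * wi for s, wi in zip(sigma, w)]
y_dagger = [s * x for s, x in zip(sigma, x_dagger)]

def error_for(delta):
    """Norm of x_alpha^delta - x_dagger for noise of norm exactly delta."""
    # The noise is spread evenly over all components (illustrative choice).
    y_delta = [yi + delta / math.sqrt(n) for yi in y_dagger]
    x = tikhonov_diagonal(sigma, y_delta, a_priori_alpha(delta, nu))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_dagger)))

if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-3, 1e-4):
        # The ratio error / delta^(2/3) should stay bounded as delta -> 0.
        print(delta, error_for(delta) / delta ** (2.0 / 3.0))
```

For this toy problem the error is bounded by a constant multiple of $\delta^{2/3}$, which is the rate $\delta^{2\nu/(2\nu+1)}$ for $\nu = 1$.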

The situation becomes more complicated in the case where the regularisation
term is not the squared Hilbert space norm and, even worse, the regularisation
takes place only in a Banach space and not a Hilbert space setting. Then
Tikhonov regularisation consists in the minimisation of a functional

$$\mathcal{T}_\alpha(x; y^\delta) = \|Ax - y^\delta\|^p + \alpha\mathcal{R}(x),$$

where $\mathcal{R}$ is some convex and lower semi-continuous functional on $X$. The first
results that treated convergence rates for non-quadratic regularisation remained
close to the quadratic case. In [19] and [4], quadratic regularisation with
convexity constraints was discussed, leading to conditions and results that are very
similar to the unconstrained case. The first result that transcended the Hilbert
space setting was the treatise [8], where convergence rates for maximum entropy
regularisation were derived. The main argument, however, was a translation of
the non-quadratic problem to an equivalent quadratic problem on a Hilbert
space.

One of the main difficulties in the generalisation of the convergence rates results
to general convex regularisation methods is that the norm on $X$, in which the
convergence rate was usually measured, need not be related to the regularisation
term that should imply the convergence. For this reason, it was argued in [3]
that it is more natural to measure the quality of the approximation not in terms
of the norm but rather in terms of the Bregman distance with respect to the
regularisation term $\mathcal{R}$. Then it is easy to derive, for a parameter choice
$\alpha \sim \delta$, a convergence rate of order $\delta$ provided the range condition
$\operatorname{Ran} A^* \cap \partial\mathcal{R}(x^\dagger) \neq \emptyset$
holds. Here $\partial\mathcal{R}(x^\dagger)$ denotes the sub-differential of the convex
functional $\mathcal{R}$ at $x^\dagger$. This result corresponds to the case $\nu = 1/2$
in (1). In addition, a result with the improved range condition
$\operatorname{Ran} A^*A \cap \partial\mathcal{R}(x^\dagger) \neq \emptyset$ with $Y$ being
a Hilbert space has been derived in [22] (see also [501 [H] for more general results).

The above results on convergence rates for regularisation on Banach spaces
correspond to the cases $\nu = 1/2$ or $\nu = 1$ in (1); none of the intermediate cases
was treated in [3] or [22]. There are two main reasons: First, an intermediate
range condition is only possible if fractional powers of $A$ can be defined, which
basically limits the theory to Hilbert spaces, and, second, the classical proofs of
the intermediate convergence rates rely on the possibility of writing $x_\alpha^\delta$
as the solution of a linear operator equation.

An alternative approach for the derivation of convergence rates in a quadratic
Hilbert space setting was introduced in [16] (see also [23], where a similar idea
was used in a slightly different context) under the headings of approximate
source conditions and distance functions. This approach was used with success
in [TJ (note also the preliminary results in jlSj), leading to the whole range of
estimates also available for the quadratic case. Another alternative, variational
inequalities, arose in [17] (see also [24, Sec. 3.2]) from the search for a natural
way of extending convergence rates results to non-linear problems. In [2, 11]
it was shown that these variational inequalities can be used for covering the
whole range of slow convergence rates corresponding to the cases $0 < \nu < 1/2$
in (1). The two concepts were also combined in [10] as approximate variational
inequalities, but the obtained rates were not different from those in [2, 11].

In [5], the different new concepts for obtaining (slow) convergence rates for 
convex regularisation methods in Banach spaces were discussed and compared 
in the case of quadratic regularisation on Hilbert spaces. It turned out that, 
in this simple setting, they were equivalent. In more complicated situations, 
however, in particular for the regularisation of non-linear operators, but also for
non-convex regularisation (see [U), variational inequalities might be easier to
handle. It is, however, necessary to show that they can also be used for deriving 
fast convergence rates. 

In this paper, it is shown that this is indeed possible for linear operators on
Banach spaces. The proof and also the formulation of the variational inequality
are based on a duality argument that shows that, if the standard range condition
$\operatorname{Ran} A^* \cap \partial\mathcal{R}(x^\dagger) \neq \emptyset$ holds, then the
minimisation of the Tikhonov functional $\mathcal{T}_\alpha$ with noisy data $y^\delta$
can be translated to the approximate minimisation of a
dual Tikhonov functional with exact data. In the special case of a sufficiently
smooth regularisation term $\mathcal{R}$ with sufficiently smooth dual $\mathcal{R}^*$,
the ensuing condition for the best possible rates with this method is almost identical
to the range condition for the Hilbert space corresponding to the case $\nu = 1$ in (1).

2 Notation 

Let $X$ and $Y$ be Banach spaces and $A\colon X \to Y$ a bounded linear operator.
Moreover assume that $\mathcal{R}\colon X \to [0, +\infty]$ is a convex, lower
semi-continuous, and coercive regularisation functional and $p > 1$. Consider, for
given data $y \in Y$ and a regularisation parameter $\alpha > 0$, the Tikhonov
functional $\mathcal{T}_\alpha(\cdot; y)\colon X \to [0, +\infty]$ defined by

$$\mathcal{T}_\alpha(x; y) := \frac{1}{p}\|Ax - y\|^p + \alpha\mathcal{R}(x).$$

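From now on we assume that $y^\dagger \in Y$ are some fixed data, and we want to
solve the equation

$$Ax = y^\dagger$$

for $x$. By

$$x^\dagger :\in \operatorname{arg\,min}\{\mathcal{R}(x) : Ax = y^\dagger\}$$

we denote any $\mathcal{R}$-minimising solution of the equation $Ax = y^\dagger$.
Moreover, for $\alpha > 0$ we denote by

$$x_\alpha :\in \operatorname{arg\,min}\{\mathcal{T}_\alpha(x; y^\dagger) : x \in X\}$$

any minimiser of the Tikhonov functional with exact data $y^\dagger$. Finally, for
$\delta > 0$ and $\alpha > 0$ we denote by

$$x_\alpha^\delta :\in \operatorname{arg\,min}\{\mathcal{T}_\alpha(x; y^\delta) : x \in X\}$$

any minimiser of the Tikhonov functional with perturbed data $y^\delta$, where
$y^\delta \in Y$ is any element satisfying

$$\|y^\delta - y^\dagger\| \le \delta.$$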

The main goal of this paper is the derivation of convergence rates, that is,
estimates of the distance between $x_\alpha^\delta$ and $x^\dagger$ depending on the
noise level $\delta$ and the regularisation parameter $\alpha$. Starting with [3],
such estimates have usually been derived in terms of the Bregman distance defined
by the convex regularisation functional $\mathcal{R}$.



Definition 2.1. Assume that $x \in X$ and $\xi \in \partial\mathcal{R}(x)$. Then we
define the Bregman distance with respect to $\mathcal{R}$ as the mapping
$\mathcal{D}_\xi(\cdot; x)\colon X \to [0, +\infty]$,

$$\mathcal{D}_\xi(\tilde x; x) = \mathcal{R}(\tilde x) - \mathcal{R}(x) - \langle \xi, \tilde x - x\rangle.$$

That is, the Bregman distance measures the difference between the graph of $\mathcal{R}$
and its affine approximation at $x$.
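
To make the definition concrete, the following minimal sketch evaluates the Bregman distance on $\mathbb{R}^n$ for two differentiable regularisers, so that the subgradient is the gradient. Both examples are illustrative assumptions, not part of the paper: the quadratic functional $\frac12\|x\|^2$, for which the Bregman distance is $\frac12\|\tilde x - x\|^2$, and the negative Boltzmann-Shannon entropy, for which it is the generalised Kullback-Leibler divergence.

```python
import math

def bregman(R, grad_R, x_tilde, x):
    """D_xi(x_tilde; x) = R(x_tilde) - R(x) - <xi, x_tilde - x> with xi = grad R(x)."""
    xi = grad_R(x)
    pairing = sum(g * (a - b) for g, a, b in zip(xi, x_tilde, x))
    return R(x_tilde) - R(x) - pairing

# Quadratic regulariser R(x) = 1/2 ||x||^2 with gradient x.
def half_sq(x):
    return 0.5 * sum(v * v for v in x)

def half_sq_grad(x):
    return list(x)

# Negative Boltzmann-Shannon entropy on positive vectors,
# R(x) = sum(x_i log x_i - x_i), with gradient log x_i.
def neg_entropy(x):
    return sum(v * math.log(v) - v for v in x)

def neg_entropy_grad(x):
    return [math.log(v) for v in x]

def kullback_leibler(x_tilde, x):
    """Generalised Kullback-Leibler divergence, used only for comparison."""
    return sum(a * math.log(a / b) - a + b for a, b in zip(x_tilde, x))

if __name__ == "__main__":
    print(bregman(half_sq, half_sq_grad, [1.0, 2.0], [3.0, -1.0]))
    print(bregman(neg_entropy, neg_entropy_grad, [0.2, 0.8], [0.5, 0.5]))
```

The entropy example shows why the Bregman distance, rather than a norm, is the natural error measure for a non-quadratic regularisation term.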

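In addition, if $x, \tilde x \in X$ and $\xi \in \partial\mathcal{R}(x)$,
$\tilde\xi \in \partial\mathcal{R}(\tilde x)$, then we denote by

$$\mathcal{D}^{\rm sym}_{\xi,\tilde\xi}(x, \tilde x) := \mathcal{D}_\xi(\tilde x; x) + \mathcal{D}_{\tilde\xi}(x; \tilde x)$$

the symmetrical Bregman distance between $x$ and $\tilde x$ with respect to
$\mathcal{R}$. An easy computation shows that the identity

$$\mathcal{D}^{\rm sym}_{\xi,\tilde\xi}(x, \tilde x) = \langle \xi - \tilde\xi, x - \tilde x\rangle$$

holds.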
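
The computation behind this identity is short enough to spell out. As a sketch in the notation of Definition 2.1 (the tilde marking the second point is a notational assumption made here), the two values of $\mathcal{R}$ cancel and only the dual pairings remain:

```latex
\begin{align*}
\mathcal{D}_\xi(\tilde x; x) + \mathcal{D}_{\tilde\xi}(x; \tilde x)
 &= \bigl[\mathcal{R}(\tilde x) - \mathcal{R}(x) - \langle \xi, \tilde x - x\rangle\bigr]
  + \bigl[\mathcal{R}(x) - \mathcal{R}(\tilde x) - \langle \tilde\xi, x - \tilde x\rangle\bigr] \\
 &= \langle \xi, x - \tilde x\rangle - \langle \tilde\xi, x - \tilde x\rangle
  = \langle \xi - \tilde\xi, x - \tilde x\rangle .
\end{align*}
```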



Note that the Bregman distance at $x$ depends on the choice of the subgradient
$\xi \in \partial\mathcal{R}(x)$ (unless $\partial\mathcal{R}(x)$ contains a single
element). Moreover it can happen that $\partial\mathcal{R}(x) = \emptyset$, in which
case the Bregman distance cannot be defined.

3 Duality Mappings 

In addition to the Bregman distance with respect to TZ, we will also need the 
Bregman distance with respect to the p-th power of the norm on the space Y. 
This Bregman distance can be written in terms of duality mappings. In this 
section we recall several results concerning duality mappings that will be needed 
for the derivation and interpretation of the main result on convergence rates. 

Throughout the whole text, whenever $q > 1$ we denote by $q_*$ its conjugate
defined by $1/q + 1/q_* = 1$.

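Definition 3.1. Let $q > 1$. The duality mapping $J_q\colon Y \to 2^{Y^*}$ is defined by

$$J_q(y) = \{\omega \in Y^* : \langle \omega, y\rangle = \|\omega\|\|y\|,\ \|\omega\| = \|y\|^{q-1}\}.$$

The adjoint duality mapping $J_q^*\colon Y^* \to 2^{Y}$ is defined by

$$J_q^*(\omega) = \{y \in Y : \langle \omega, y\rangle = \|\omega\|\|y\|,\ \|y\| = \|\omega\|^{q_*-1}\}.$$

If $Y$ is reflexive, one can show that $J_q^* = (J_q)^{-1}$ (see JS] Chap. 1, Thm. 4.4]).
Moreover it is easy to see that $J_q$ is $(q-1)$-homogeneous. That is, if $y \in Y$ and
$\lambda \in \mathbb{R}$, then

$$J_q(\lambda y) = |\lambda|^{q-2}\lambda\, J_q(y).$$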

Lemma 3.2. Let $q > 1$ and denote by $S_q\colon Y \to [0, +\infty]$ the mapping

$$S_q(y) := \frac{1}{q}\|y\|^q.$$

Then

$$J_q(y) = \partial S_q(y).$$

Proof. See Chap. 2, Cor. 3.5]. □
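
As an illustration of the duality mapping, consider $\mathbb{R}^n$ with the $\ell^r$ norm for $1 < r < \infty$. This space is smooth, and the duality mapping has the closed form $J_q(y)_i = \|y\|_r^{q-r}|y_i|^{r-1}\operatorname{sign}(y_i)$, with dual norm the $\ell^{r_*}$ norm. The sketch below is an assumption-laden example rather than part of the paper; the function names are hypothetical.

```python
import math

def lr_norm(y, r):
    """The l^r norm of a finite vector."""
    return sum(abs(v) ** r for v in y) ** (1.0 / r)

def duality_map(y, q, r):
    """Duality mapping J_q on R^n equipped with the l^r norm, 1 < r < infinity.

    Single valued because the space is smooth; the result is to be read
    as an element of the dual space, i.e. R^n with the l^{r_*} norm.
    """
    norm = lr_norm(y, r)
    if norm == 0.0:
        return [0.0 for _ in y]
    scale = norm ** (q - r)
    return [scale * abs(v) ** (r - 1) * math.copysign(1.0, v) for v in y]

if __name__ == "__main__":
    y, q, r = [1.0, -2.0, 0.5], 3.0, 1.5
    omega = duality_map(y, q, r)
    r_star, q_star = r / (r - 1.0), q / (q - 1.0)
    # Defining properties: <omega, y> = ||y||^q and ||omega|| = ||y||^(q-1).
    print(sum(a * b for a, b in zip(omega, y)), lr_norm(y, r) ** q)
    print(lr_norm(omega, r_star), lr_norm(y, r) ** (q - 1))
    # The duality mapping with conjugate exponents on the dual space inverts J_q.
    print(duality_map(omega, q_star, r_star))
```

The last line reflects the relation $J_q^* = (J_q)^{-1}$ for reflexive spaces: applying the duality mapping with exponent $q_*$ on the dual space recovers $y$ up to rounding.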

Recall that the Banach space $Y$ is smooth (see [H]), if for every $y \in Y$
satisfying $\|y\| = 1$ there exists a unique $\omega \in Y^*$ such that
$\|\omega\| = \langle \omega, y\rangle = 1$.
Equivalently, $Y$ is smooth, if and only if every duality mapping $J_q$ on $Y$ with
$q > 1$ is single valued. In this case, we regard $J_q$ as a mapping from $Y$ to $Y^*$.
This allows us to formulate the Bregman distance with respect to $S_q$ in terms
of the duality mapping $J_q$.

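Definition 3.3. Assume that $Y$ is a smooth Banach space and let $q > 1$. For
$y \in Y$ we define the Bregman distance with respect to $S_q$ at $y$ by

$$\mathcal{D}_q(\tilde y; y) := \frac{1}{q}\|\tilde y\|^q - \frac{1}{q}\|y\|^q - \langle J_q(y), \tilde y - y\rangle.$$

Moreover we define the symmetric Bregman distance with respect to $S_q$ as

$$\mathcal{D}_q^{\rm sym}(\tilde y, y) := \langle J_q(\tilde y) - J_q(y), \tilde y - y\rangle.$$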

The following definition is taken from [18].

Definition 3.4. The modulus of convexity $\delta_Y\colon (0, 2] \to [0, 1]$ is defined by

$$\delta_Y(\varepsilon) := \inf\{1 - \|y + \tilde y\|/2 : y, \tilde y \in Y,\ \|y\| = \|\tilde y\| = 1,\ \|y - \tilde y\| = \varepsilon\}.$$

The modulus of smoothness $\rho_Y\colon (0, +\infty) \to (0, +\infty)$ is defined by

$$\rho_Y(\tau) := \sup\{(\|y + \tilde y\| + \|y - \tilde y\|)/2 - 1 : y, \tilde y \in Y,\ \|y\| = 1,\ \|\tilde y\| = \tau\}.$$

The space $Y$ is called uniformly convex if $\delta_Y(\varepsilon) > 0$ for every
$\varepsilon > 0$. It is called uniformly smooth if $\rho_Y(\tau) = o(\tau)$ as $\tau \to 0$.

The space $Y$ is called $q$-convex (or convex of power type $q$), if there exists a
constant $K > 0$ such that $\delta_Y(\varepsilon) \ge K\varepsilon^q$ for all
$\varepsilon$. Similarly, it is called $q$-smooth (or smooth of power type $q$), if
$\rho_Y(\tau) \le K\tau^q$ for all sufficiently small $\tau > 0$.

One can show that the space $Y$ is $q$-convex, if and only if $Y^*$ is $q_*$-smooth
(in fact, the relation $2\rho_{Y^*} = (2\delta_Y)^*$ holds, see [18, Prop. 1.e.2]).

For us, the most important property of $q$-convex spaces is that they allow
to estimate the $q$-th power of the norm from above by the Bregman distance.
More precisely, the following result holds:

Lemma 3.5. Assume that the Banach space $Y$ is $q$-convex. Then there exists
a constant $C > 0$ such that

$$C\|\tilde y - y\|^q \le \mathcal{D}_q(\tilde y; y)$$

for all $y, \tilde y \in Y$.

Proof. See [H Lemma 2.7]. □



4 Convergence Rates 

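In the following we always assume that $x^\dagger$ satisfies a range condition with

$$A^*\omega^\dagger \in \partial\mathcal{R}(x^\dagger) \quad\text{for some } \omega^\dagger \in Y^*.$$

Then the definition of the dual $\mathcal{R}^*\colon X^* \to (-\infty, +\infty]$ of
$\mathcal{R}$ implies that

$$x^\dagger \in \partial\mathcal{R}^*(A^*\omega^\dagger).$$

Consequently we can define a Bregman distance
$\mathcal{D}^*_{x^\dagger}(\cdot; A^*\omega^\dagger)\colon X^* \to [0, +\infty]$
for the dual function $\mathcal{R}^*$ as

$$\mathcal{D}^*_{x^\dagger}(\xi; A^*\omega^\dagger) = \mathcal{R}^*(\xi) - \mathcal{R}^*(A^*\omega^\dagger) - \langle \xi - A^*\omega^\dagger, x^\dagger\rangle.$$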

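Proposition 4.1. Assume that $x_\alpha^\delta$ minimises
$\mathcal{T}_\alpha(\cdot; y^\delta)$. Then there exists $\omega_\alpha^\delta \in Y^*$
such that

$$A^*\omega_\alpha^\delta \in \partial\mathcal{R}(x_\alpha^\delta), \qquad -\alpha\omega_\alpha^\delta = J_p(Ax_\alpha^\delta - y^\delta). \tag{2}$$

Moreover $\omega_\alpha^\delta$ minimises the functional
$\mathcal{T}^*_\alpha(\cdot; y^\delta)\colon Y^* \to (-\infty, +\infty]$ defined by

$$\mathcal{T}^*_\alpha(\omega; y^\delta) := \mathcal{D}^*_{x^\dagger}(A^*\omega; A^*\omega^\dagger) + \frac{\alpha^{p_*-1}}{p_*}\|\omega\|^{p_*} - \langle \omega - \omega^\dagger, y^\delta - y^\dagger\rangle.$$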

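Proof. The dual of the problem

$$\mathcal{T}_\alpha(x; y^\delta)/\alpha = \frac{1}{\alpha p}\|Ax - y^\delta\|^p + \mathcal{R}(x) \to \min$$

is the problem

$$\frac{1}{\alpha p_*}\|\alpha\omega\|^{p_*} - \langle \omega, y^\delta\rangle + \mathcal{R}^*(A^*\omega) \to \min$$

(see [HI Chap. III]). Writing

$$-\langle \omega, y^\delta\rangle = \langle \omega, y^\dagger - y^\delta\rangle - \langle \omega, y^\dagger\rangle = -\langle \omega, y^\delta - y^\dagger\rangle - \langle A^*\omega, x^\dagger\rangle$$

and adding the constant terms $\langle A^*\omega^\dagger, x^\dagger\rangle$,
$\langle \omega^\dagger, y^\delta - y^\dagger\rangle$, and
$-\mathcal{R}^*(A^*\omega^\dagger)$, we obtain the problem

$$\frac{\alpha^{p_*-1}}{p_*}\|\omega\|^{p_*} + \mathcal{R}^*(A^*\omega) - \mathcal{R}^*(A^*\omega^\dagger) - \langle A^*\omega - A^*\omega^\dagger, x^\dagger\rangle - \langle \omega - \omega^\dagger, y^\delta - y^\dagger\rangle \to \min,$$

which is precisely the minimisation of the dual Tikhonov functional
$\mathcal{T}^*_\alpha(\cdot; y^\delta)$. The relations (2) are nothing else than the
Karush-Kuhn-Tucker conditions for the minimisation problem (cf. [;6j Chap. III]). □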

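Remark 4.2. In particular, Proposition 4.1 implies that, if $x_\alpha^\delta$ minimises
the primal Tikhonov functional with noisy data $y^\delta$, then $\omega_\alpha^\delta$
almost minimises the dual Tikhonov functional $\mathcal{T}^*_\alpha(\cdot; y^\dagger)$
with exact data $y^\dagger$. More precisely, we have

$$\mathcal{T}^*_\alpha(\omega_\alpha^\delta; y^\dagger) \le \inf\{\mathcal{T}^*_\alpha(\omega; y^\dagger) : \omega \in Y^*\} + \delta\|\omega_\alpha^\delta - \omega^\dagger\|.$$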



Definition 4.3. An index function is a strictly increasing, continuous and concave
function $\Phi\colon [0, +\infty) \to [0, +\infty)$ satisfying $\Phi(0) = 0$.

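Theorem 4.4. Assume that $Y$ is a $p$-smooth Banach space and that there exists
an index function $\Phi$ such that

$$\langle \omega^\dagger - \omega, J_{p_*}(\omega^\dagger)\rangle \le \Phi\bigl(\mathcal{D}^*_{x^\dagger}(A^*\omega; A^*\omega^\dagger)\bigr) \tag{3}$$

for every $\omega \in Y^*$. Denote by $\Psi$ the conjugate of the convex mapping
$t \mapsto \Phi^{-1}(t)$. Then there exists a constant $D$ only depending on $p$ and
the Banach space $Y$ such that

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger) \le \Psi(\alpha^{p_*-1}) + D\,\frac{\delta^p}{\alpha}.$$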
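Proof. We have

$$\mathcal{D}^{\rm sym}_{A^*\omega_\alpha^\delta, A^*\omega^\dagger}(x_\alpha^\delta, x^\dagger) = \langle A^*\omega_\alpha^\delta - A^*\omega^\dagger, x_\alpha^\delta - x^\dagger\rangle$$
$$= \langle \omega_\alpha^\delta - \omega^\dagger, Ax_\alpha^\delta - Ax^\dagger\rangle$$
$$= \langle \omega_\alpha^\delta - \omega^\dagger, Ax_\alpha^\delta - y^\delta\rangle + \langle \omega_\alpha^\delta - \omega^\dagger, y^\delta - y^\dagger\rangle.$$

Proposition 4.1 implies that $-\alpha\omega_\alpha^\delta = J_p(Ax_\alpha^\delta - y^\delta)$,
and therefore

$$Ax_\alpha^\delta - y^\delta = J_{p_*}(-\alpha\omega_\alpha^\delta) = -\alpha^{p_*-1}J_{p_*}(\omega_\alpha^\delta).$$

Consequently

$$\mathcal{D}^{\rm sym}_{A^*\omega_\alpha^\delta, A^*\omega^\dagger}(x_\alpha^\delta, x^\dagger) = \alpha^{p_*-1}\langle \omega^\dagger - \omega_\alpha^\delta, J_{p_*}(\omega^\dagger)\rangle - \alpha^{p_*-1}\mathcal{D}^{\rm sym}_{p_*}(\omega_\alpha^\delta, \omega^\dagger) + \langle \omega_\alpha^\delta - \omega^\dagger, y^\delta - y^\dagger\rangle.$$

Using (3) and estimating
$\langle \omega_\alpha^\delta - \omega^\dagger, y^\delta - y^\dagger\rangle \le \delta\|\omega_\alpha^\delta - \omega^\dagger\|$,
we obtain

$$\mathcal{D}^{\rm sym}_{A^*\omega_\alpha^\delta, A^*\omega^\dagger}(x_\alpha^\delta, x^\dagger) \le \alpha^{p_*-1}\Phi\bigl(\mathcal{D}^*_{x^\dagger}(A^*\omega_\alpha^\delta; A^*\omega^\dagger)\bigr) - \alpha^{p_*-1}\mathcal{D}^{\rm sym}_{p_*}(\omega_\alpha^\delta, \omega^\dagger) + \delta\|\omega_\alpha^\delta - \omega^\dagger\|. \tag{4}$$

Now,

$$\mathcal{D}^{\rm sym}_{A^*\omega_\alpha^\delta, A^*\omega^\dagger}(x_\alpha^\delta, x^\dagger) = \mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger) + \mathcal{D}_{A^*\omega_\alpha^\delta}(x^\dagger; x_\alpha^\delta),$$

and the assumptions $A^*\omega^\dagger \in \partial\mathcal{R}(x^\dagger)$ and
$A^*\omega_\alpha^\delta \in \partial\mathcal{R}(x_\alpha^\delta)$ imply that

$$\mathcal{D}_{A^*\omega_\alpha^\delta}(x^\dagger; x_\alpha^\delta) = \mathcal{R}(x^\dagger) - \mathcal{R}(x_\alpha^\delta) - \langle A^*\omega_\alpha^\delta, x^\dagger - x_\alpha^\delta\rangle$$
$$= \langle A^*\omega^\dagger, x^\dagger\rangle - \mathcal{R}^*(A^*\omega^\dagger) - \langle A^*\omega_\alpha^\delta, x_\alpha^\delta\rangle + \mathcal{R}^*(A^*\omega_\alpha^\delta) - \langle A^*\omega_\alpha^\delta, x^\dagger - x_\alpha^\delta\rangle$$
$$= \mathcal{R}^*(A^*\omega_\alpha^\delta) - \mathcal{R}^*(A^*\omega^\dagger) - \langle A^*(\omega_\alpha^\delta - \omega^\dagger), x^\dagger\rangle$$
$$= \mathcal{D}^*_{x^\dagger}(A^*\omega_\alpha^\delta; A^*\omega^\dagger).$$

Because $Y$ is $p$-smooth, its dual space $Y^*$ is $p_*$-convex. Thus there exists
$C > 0$ such that (see Lemma 3.5)

$$C\|\omega_\alpha^\delta - \omega^\dagger\|^{p_*} \le \mathcal{D}^{\rm sym}_{p_*}(\omega_\alpha^\delta, \omega^\dagger).$$

Consequently (4) implies that

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger) + \mathcal{D}^*_{x^\dagger}(A^*\omega_\alpha^\delta; A^*\omega^\dagger) \le \alpha^{p_*-1}\Phi\bigl(\mathcal{D}^*_{x^\dagger}(A^*\omega_\alpha^\delta; A^*\omega^\dagger)\bigr) + \delta\|\omega_\alpha^\delta - \omega^\dagger\| - C\alpha^{p_*-1}\|\omega_\alpha^\delta - \omega^\dagger\|^{p_*}.$$

Applying Young's inequality (see [T5] Thm. 13.2]), we obtain

$$\alpha^{p_*-1}\Phi\bigl(\mathcal{D}^*_{x^\dagger}(A^*\omega_\alpha^\delta; A^*\omega^\dagger)\bigr) \le \Psi(\alpha^{p_*-1}) + \mathcal{D}^*_{x^\dagger}(A^*\omega_\alpha^\delta; A^*\omega^\dagger)$$

and

$$\delta\|\omega_\alpha^\delta - \omega^\dagger\| \le C\alpha^{p_*-1}\|\omega_\alpha^\delta - \omega^\dagger\|^{p_*} + \frac{\delta^p}{p\,p_*^{p/p_*}C^{p/p_*}\alpha}.$$

Therefore

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger) \le \Psi(\alpha^{p_*-1}) + \frac{\delta^p}{p\,p_*^{p/p_*}C^{p/p_*}\alpha},$$

which proves the assertion. □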
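
The constant in the noise term can be traced through the second application of Young's inequality. The following worked step is a sketch; the scaling parameter $\varepsilon$ is an auxiliary quantity introduced here for illustration only, with $a$ standing for the norm of the difference of the dual variables and $b = \delta$.

```latex
\begin{align*}
ab &\le \frac{(\varepsilon a)^{p_*}}{p_*} + \frac{(b/\varepsilon)^{p}}{p}
   \qquad (a, b \ge 0,\ \varepsilon > 0), \\
\frac{\varepsilon^{p_*}}{p_*} = C\alpha^{p_*-1}
   &\;\Longrightarrow\;
   \varepsilon^{p} = (p_* C)^{p/p_*}\,\alpha^{(p_*-1)p/p_*}
                   = (p_* C)^{p/p_*}\,\alpha ,
   \qquad\text{since } \frac{(p_*-1)p}{p_*} = p - \frac{p}{p_*} = 1, \\
\delta a &\le C\alpha^{p_*-1} a^{p_*} + \frac{\delta^{p}}{p\,(p_* C)^{p/p_*}\,\alpha}.
\end{align*}
```

With this choice the term $C\alpha^{p_*-1}a^{p_*}$ cancels against the $p_*$-convexity estimate, and one may take $D = 1/(p\,p_*^{p/p_*}C^{p/p_*})$.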

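Corollary 4.5. Assume that the assumptions of Theorem 4.4 are satisfied.
Then we have for exact data $y^\dagger$ the estimate

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha; x^\dagger) \le \Psi(\alpha^{p_*-1}).$$

In the special case where $\Phi(t) = ct^{1/q_*}$ for some $q_* > 1$ and $c > 0$ we have

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha; x^\dagger) \le \frac{c^q}{q\,q_*^{q-1}}\,\alpha^{(p_*-1)q}.$$

Proof. In the general case, the assertion directly follows from Theorem 4.4 for
$\delta = 0$. In the case $\Phi(t) = ct^{1/q_*}$, the mapping $\Psi$ is the conjugate of
$\Phi^{-1}(t) = t^{q_*}/c^{q_*}$. A short calculation shows that $\Psi$ indeed has the
form stated above. □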
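
The short calculation of the conjugate can be sketched as follows, assuming that the supremum is taken over $t \ge 0$ and that $q$ denotes the conjugate exponent of $q_*$, so that $1/(q_*-1) = q-1$ and $q_*(q-1) = q$.

```latex
\begin{align*}
\Psi(s) &= \sup_{t \ge 0}\,\Bigl( st - \frac{t^{q_*}}{c^{q_*}} \Bigr),
 \qquad\text{with maximiser } t_s = \Bigl(\frac{c^{q_*}s}{q_*}\Bigr)^{q-1}
 \text{ from } s = \frac{q_*\,t_s^{q_*-1}}{c^{q_*}}, \\
\Psi(s) &= s\,t_s - \frac{s\,t_s}{q_*} = \frac{s\,t_s}{q}
 = \frac{c^{q_*(q-1)}}{q\,q_*^{q-1}}\,s^{q}
 = \frac{c^{q}}{q\,q_*^{q-1}}\,s^{q}.
\end{align*}
```

Evaluating at $s = \alpha^{p_*-1}$ gives a bound proportional to $\alpha^{(p_*-1)q}$.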

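Corollary 4.6. Assume that the assumptions of Theorem 4.4 are satisfied with
an index function $\Phi(t) \sim t^{1/q_*}$ for some $q_* > 1$. Then we have for a
parameter choice

$$\alpha \sim \delta^{p/((p_*-1)q+1)}$$

the convergence rate

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger) = O\bigl(\delta^{p_*q/((p_*-1)q+1)}\bigr).$$

Proof. From Theorem 4.4 we obtain the estimate

$$\mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger) \le \Psi(\alpha^{p_*-1}) + D\,\frac{\delta^p}{\alpha},$$

where $\Psi$ is the conjugate of the function $\Phi^{-1}$. Since $\Phi(t) \sim t^{1/q_*}$,
it follows that $\Phi^{-1} \sim t^{q_*}$, and therefore $\Psi \sim t^{q}$. Thus

$$\Psi(\alpha^{p_*-1}) + D\,\frac{\delta^p}{\alpha} \sim \alpha^{(p_*-1)q} + \frac{\delta^p}{\alpha} \sim \delta^{\frac{p(p_*-1)q}{(p_*-1)q+1}} + \delta^{p - \frac{p}{(p_*-1)q+1}} \sim \delta^{\frac{p(p_*-1)q}{(p_*-1)q+1}},$$

which proves the assertion, as $p(p_* - 1) = p_*$. □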
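
As a consistency check, the rate can be specialised to the classical setting. The following sketch assumes that $X$ and $Y$ are Hilbert spaces, that the regularisation term is half the squared norm, and that $p = 2$ and $q_* = 2$, so that $p_* = q = 2$ as well.

```latex
\begin{align*}
\alpha &\sim \delta^{p/((p_*-1)q+1)} = \delta^{2/3}, \\
\mathcal{D}_{A^*\omega^\dagger}(x_\alpha^\delta; x^\dagger)
  &= \tfrac12\,\|x_\alpha^\delta - x^\dagger\|^2
   = O\bigl(\delta^{p_*q/((p_*-1)q+1)}\bigr) = O(\delta^{4/3}), \\
\|x_\alpha^\delta - x^\dagger\| &= O(\delta^{2/3}).
\end{align*}
```

This is the parameter choice $\alpha \sim \delta^{2/(2\nu+1)}$ and the rate $\delta^{2\nu/(2\nu+1)}$ of the introduction for $\nu = 1$.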



5 Implications of Smoothness 

In this section, we first discuss the limitations of Theorem 4.4 in the case where
the dual of $\mathcal{R}$ is smooth. Then the best possible rates imply that the dual
variable $\omega^\dagger$ satisfies the range condition
$J_{p_*}(\omega^\dagger) \in \operatorname{Ran} A$. Conversely, this
range condition implies rates, if the functional $\mathcal{R}$ itself is sufficiently smooth.

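Lemma 5.1. Assume that $\mathcal{R}^*$ is two times Fréchet differentiable at
$A^*\omega^\dagger$ and that there exists an index function $\Phi$ such that

$$\langle \omega^\dagger - \omega, J_{p_*}(\omega^\dagger)\rangle \le \Phi\bigl(\mathcal{D}^*_{x^\dagger}(A^*\omega; A^*\omega^\dagger)\bigr) \tag{5}$$

for all $\omega \in Y^*$. If $\Phi(t) = o(t^{1/2})$ as $t \to 0$, then $x^\dagger$
minimises $\mathcal{R}$.
In addition, if one can choose $\Phi(t) \sim t^{1/2}$ as $t \to 0$, then

$$J_{p_*}(\omega^\dagger) \in \operatorname{Ran} A.$$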

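Proof. Because $\mathcal{R}^*$ is two times Fréchet differentiable at
$A^*\omega^\dagger$, a Taylor expansion of $\mathcal{R}^*$ around $A^*\omega^\dagger$
shows that

$$\mathcal{D}^*_{x^\dagger}(A^*\omega; A^*\omega^\dagger) = \mathcal{R}^*(A^*\omega) - \mathcal{R}^*(A^*\omega^\dagger) - \langle x^\dagger, A^*(\omega - \omega^\dagger)\rangle$$
$$= \tfrac12(\mathcal{R}^*)''(A^*\omega^\dagger)\bigl(A^*(\omega - \omega^\dagger); A^*(\omega - \omega^\dagger)\bigr) + o\bigl(\|A^*(\omega - \omega^\dagger)\|^2\bigr),$$

where $(\mathcal{R}^*)''(A^*\omega^\dagger) \in B(X^*)$ denotes the second order
derivative of $\mathcal{R}^*$ at $A^*\omega^\dagger$, which is a symmetric, bounded
bilinear form on $X^*$. Writing $\hat\omega := \omega^\dagger - \omega$, the
inequality (5) implies that

$$\langle \hat\omega, J_{p_*}(\omega^\dagger)\rangle \le \Phi\Bigl(\tfrac12(\mathcal{R}^*)''(A^*\omega^\dagger)(A^*\hat\omega; A^*\hat\omega) + o(\|A^*\hat\omega\|^2)\Bigr)$$

for all $\hat\omega \in Y^*$. In addition, one can estimate

$$(\mathcal{R}^*)''(A^*\omega^\dagger)(A^*\hat\omega; A^*\hat\omega) \le \|(\mathcal{R}^*)''(A^*\omega^\dagger)\|\,\|A^*\hat\omega\|^2.$$

Therefore,

$$\langle \hat\omega, J_{p_*}(\omega^\dagger)\rangle \le \Phi\bigl(\|(\mathcal{R}^*)''(A^*\omega^\dagger)\|\,\|A^*\hat\omega\|^2 + o(\|A^*\hat\omega\|^2)\bigr). \tag{6}$$

Now assume that $\Phi(t) = o(t^{1/2})$ as $t \to 0$. Then, dividing (6) by
$\|A^*\hat\omega\|$ and considering the limit $\|A^*\hat\omega\| \to 0$, we see that
$J_{p_*}(\omega^\dagger) = 0$, which is equivalent to stating that $\omega^\dagger = 0$.
Thus $0 = A^*\omega^\dagger \in \partial\mathcal{R}(x^\dagger)$, which proves that
$x^\dagger$ minimises $\mathcal{R}$.

Now assume that $\Phi(t) \sim t^{1/2}$ as $t \to 0$. Then (6) implies that there exists
a constant $C > 0$ such that

$$\langle \hat\omega, J_{p_*}(\omega^\dagger)\rangle \le C\|(\mathcal{R}^*)''(A^*\omega^\dagger)\|^{1/2}\|A^*\hat\omega\|$$

for $\|A^*\hat\omega\|$ sufficiently small, and thus everywhere, as the right hand side
is positively homogeneous and the left hand side is linear. Thus [23, Lemma 8.21]
implies that $J_{p_*}(\omega^\dagger) \in \operatorname{Ran} A$. □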

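In the special case, where $X$ is a $q$-convex Banach space and

$$\mathcal{R}(x) = \frac{1}{q}\|x\|^q,$$

the above result can be slightly refined. In this case, the condition of
Theorem 4.4 reads as

$$\langle \omega^\dagger - \omega, J_{p_*}(\omega^\dagger)\rangle \le \Phi\bigl(\mathcal{D}_{q_*}(A^*\omega; A^*\omega^\dagger)\bigr)$$

and the resulting estimate is

$$\mathcal{D}_q(x_\alpha^\delta; x^\dagger) \le \Psi(\alpha^{p_*-1}) + D\,\frac{\delta^p}{\alpha}.$$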
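Lemma 5.2. Assume that $X$ is $q$-convex and

$$\mathcal{R}(x) = \frac{1}{q}\|x\|^q$$

and that

$$\langle \omega^\dagger - \omega, J_{p_*}(\omega^\dagger)\rangle \le \Phi\bigl(\mathcal{D}_{q_*}(A^*\omega; A^*\omega^\dagger)\bigr)$$

for some index function $\Phi$. If

$$\Phi(t) = o(t^{1/q_*}) \quad\text{as } t \to 0,$$

then $x^\dagger = 0$.

Proof. We have

$$\Phi\bigl(\mathcal{D}_{q_*}(A^*\omega; A^*\omega^\dagger)\bigr) \le \Phi\bigl(C_{q_*}\|A^*(\omega - \omega^\dagger)\|^{q_*}\bigr).$$

Thus the assumption $\Phi(t) = o(t^{1/q_*})$ implies that, writing
$\hat\omega := \omega^\dagger - \omega$,

$$\langle \hat\omega, J_{p_*}(\omega^\dagger)\rangle \le o(\|A^*\hat\omega\|) \quad\text{as } \|A^*\hat\omega\| \to 0.$$

This, however, is only possible, if $J_{p_*}(\omega^\dagger) = 0$, which is equivalent
to stating that $\omega^\dagger = 0$. Thus $0 = A^*\omega^\dagger \in J_q(x^\dagger)$,
and therefore $x^\dagger = 0$. □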

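Lemma 5.3. Assume that $\mathcal{R}$ is two times Fréchet differentiable at
$x^\dagger$ and that $J_{p_*}(\omega^\dagger) \in \operatorname{Ran} A$. Then there
exists $C > 0$ such that

$$\langle \omega^\dagger - \omega, J_{p_*}(\omega^\dagger)\rangle \le C\bigl(\mathcal{D}^*_{x^\dagger}(A^*\omega; A^*\omega^\dagger)\bigr)^{1/2}$$

for $\omega$ sufficiently close to $\omega^\dagger$.

Proof. Because $\mathcal{R}$ is two times Fréchet differentiable at $x^\dagger$, there
exists $c > 0$ such that

$$\mathcal{R}(x) \le \mathcal{R}(x^\dagger) + \langle A^*\omega^\dagger, x - x^\dagger\rangle + \frac{c}{2}\|x - x^\dagger\|^2$$

for every $x \in X$ sufficiently close to $x^\dagger$. Consequently,

$$\mathcal{R}^*(A^*\omega) \ge \mathcal{R}^*(A^*\omega^\dagger) + \langle A^*(\omega - \omega^\dagger), x^\dagger\rangle + \frac{1}{2c}\|A^*(\omega - \omega^\dagger)\|^2$$

for $\omega$ sufficiently close to $\omega^\dagger$. This, however, is equivalent to
stating that, locally around $\omega^\dagger$,

$$\mathcal{D}^*_{x^\dagger}(A^*\omega; A^*\omega^\dagger) \ge \frac{1}{2c}\|A^*(\omega - \omega^\dagger)\|^2.$$

Now the assumption that $J_{p_*}(\omega^\dagger) \in \operatorname{Ran} A$ is equivalent
to the estimate

$$\langle \omega^\dagger - \omega, J_{p_*}(\omega^\dagger)\rangle \le \tilde c\,\|A^*(\omega - \omega^\dagger)\|$$

for some $\tilde c > 0$ and all $\omega \in Y^*$. Assembling these inequalities, the
assertion follows. □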






6 Conclusion 

In this paper it is shown that the approach of variational inequalities can be used
for the derivation of higher order convergence rates and is thus not restricted
to the "low rate world" as has been surmised in [TU]. The basic idea for this
generalisation is the formulation of the variational inequality not for the primal 
Tikhonov functional, but rather for a dual functional. By this approach we 
obtain the whole range of convergence rates that have already been derived for 
quadratic regularisation and also for convex regularisation on Banach spaces. 
The main advantage of the usage of variational inequalities is their comparative 
simplicity, in particular when used in conjunction with non-linear operators; 
they have been introduced precisely for the study of non-linear ill-posed operator 
equations. Thus it seems reasonable that the approach can be extended also to 
the non-linear case without introducing too many artificial constraints on the 
operator. 